ABSTRACT

In this paper we compare non-parametric post hoc tests based on adjusted p
values, with a view to easy application.
1. INTRODUCTION:

When data consist of one nominal variable and one measurement variable, a
one-way ANOVA is usually used. When the measurement variable does not meet the
normality assumption of a one-way ANOVA, the parametric method is not
applicable; likewise, when the original data set consists of one nominal
variable and one ranked variable, ANOVA cannot be applied. The non-parametric
techniques developed for the k-sample problem require no assumptions beyond
continuous populations and are therefore widely applicable.


One of the assumptions of the parametric analysis is that the variability is
approximately the same across all groups. If this assumption does not hold, the
researcher should first try to transform the response variable, for example
with a log or square root transformation, in the hope of stabilizing the
variance across the groups. In certain situations, however, no transformation
resolves the problem. In such situations too, the researcher should consider
using a non-parametric test (Newman, 1995).


If the normality assumption is violated, or the sample sizes from each of the k
populations are too small to assess normality, the Kruskal-Wallis (KW) test is
used to compare the distributions of the different populations. The KW test is
the non-parametric equivalent of the omnibus F test in a one-way ANOVA (which
is used with a metric dependent variable). The KW test is used when the
dependent variable consists of ranks. It tests the null hypothesis that the
location of each group is the same in the population. If the null hypothesis is
rejected, then at least one of the locations differs from the others. When the
KW test is significant, follow-up pairwise tests are performed.


It is important to realize that the Kruskal-Wallis test is an omnibus test: it
tests the general hypothesis that all population medians are equal, but it
cannot tell which specific groups of the independent variable differ
significantly from each other; it only tells us that at least two groups
differ. The researcher, however, is usually interested not just in this general
hypothesis but in comparisons among the individual groups. Since a study design
may have three, four, five or more groups, determining which of these groups
differ from each other is important, and for this a post hoc test is used.


There are two ways to apply non-parametric post hoc procedures, the first being
to use Mann-Whitney tests. If many Mann-Whitney tests are used without
correction, the Type I error rate inflates, so this is not preferable. We can,
however, use many Mann-Whitney tests to follow up a Kruskal-Wallis test if we
make some kind of adjustment to ensure that the Type I errors do not build up
to more than .05. The easiest method is the Bonferroni correction, which in its
simplest form means that instead of using .05 as the critical value for
significance for each test, we use a critical value of .05 divided by the
number of tests conducted.
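
As an illustration of the rule above, the following minimal Python sketch
counts the pairwise comparisons among k groups and divides .05 by that count.
The function name and the group labels are hypothetical and serve only as an
example; the sketch assumes that every pair of groups is compared.

```python
from itertools import combinations

def bonferroni_threshold(groups, alpha=0.05):
    """Return all pairwise comparisons and the per-test critical value."""
    pairs = list(combinations(groups, 2))   # k groups give k(k-1)/2 pairs
    return pairs, alpha / len(pairs)        # alpha divided by the number of tests

# Hypothetical example with four groups, i.e. six Mann-Whitney tests.
pairs, threshold = bonferroni_threshold(["A", "B", "C", "D"])
print(len(pairs), threshold)
```

With four groups each Mann-Whitney test is therefore judged against .05/6
rather than .05.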


In this research paper we therefore discuss and compare the Bonferroni method
and its modifications due to Holm, Hochberg, Hommel, Holland and Rom. All
these methods are based on adjusted p values.


An adjusted p value is defined as the smallest significance level for which the
given hypothesis would be rejected when the entire family of tests is
considered. The decision rule is to reject the null hypothesis when the
adjusted p-value is less than α; in most cases, this procedure controls the FWE
at or below level α.
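
For the simplest case, the Bonferroni correction, the adjusted p value is the
raw p value multiplied by the number of tests and capped at 1. The sketch below
is only an illustration under that assumption; the function name and the three
raw p values are hypothetical.

```python
def bonferroni_adjust(p_values):
    """Bonferroni adjusted p values: c * p, capped at 1."""
    c = len(p_values)                       # size of the family of tests
    return [min(1.0, c * p) for p in p_values]

p_raw = [0.004, 0.020, 0.300]               # hypothetical raw p values
p_adj = bonferroni_adjust(p_raw)
# Decision rule from the text: reject when the adjusted p value is below alpha.
rejected = [p < 0.05 for p in p_adj]
print(p_adj, rejected)
```

Comparing the adjusted p value with α is equivalent to comparing the raw p
value with α/c.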


The tests discussed in this paper are based on adjusted p values such that, if
the adjusted p value for an individual hypothesis is less than the chosen
significance level α, the hypothesis is rejected with FWE not more than α. They
comprise the Bonferroni procedure and the modifications of that procedure by
Holm, Holland & Copenhaver, Hommel, Hochberg and Rom. Some of these methods are
single step procedures and the others are sequential methods. Sequential
methods can be further categorized in two ways, i.e. step up methods and step
down methods.


Single Step/Simultaneous Procedure:

It is also called Single Step (SS) procedure. The single step procedure sets a
single criterion for testing all individual hypotheses. An SS procedure
conducts all comparisons using a constant critical value, regardless of whether
any other comparison is significant (Einot & Gabriel, 1975). SS procedures are
valid both for hypothesis testing and for calculating confidence intervals.


In an SS procedure, the decision about any hypothesis Hi does not depend on the
decision about any other hypothesis Hj; therefore the hypotheses can be tested
without reference to one another.


Sequential Procedure:

It is also called Step Wise (SW) procedure. A stepwise procedure considers
either the significance of the omnibus test or the significance of other
comparisons, or both, in evaluating the significance of a particular
comparison.


In an SW procedure, the hypotheses are tested in a specific order, generally
determined by the magnitudes of the test statistics or the associated p-values
pi, and the decisions on them are made in a stepwise manner. The decisions on
the earlier hypotheses in the order may affect those on the later hypotheses.


A major disadvantage of stepwise (multi-stage) methods is that they do not
allow the construction of confidence intervals, which are extremely useful for
the interpretation of the results.


SW procedures can be further subdivided into 2 categories.
1. Step Up procedures (SU)
2. Step Down procedures (SD)


Step Up procedures:

In an SU procedure, the hypotheses are tested beginning with the least
significant one, and testing continues until a hypothesis is rejected, at which
point all the remaining hypotheses are rejected by implication without actually
testing them (Tamhane & Dunnett, 1999).


An SU procedure begins by testing all minimal hypotheses and then steps up
through the hierarchy of implying hypotheses. If any hypothesis is rejected,
then all of its implying hypotheses are rejected without further tests; thus a
hypothesis is tested if and only if all of its implied hypotheses are retained.


Step Down Procedure:

A step down procedure begins by testing the overall intersection hypothesis and
then steps down through the hierarchy of implied hypotheses. If any hypothesis
is not rejected, then all of its implied hypotheses are retained without
further tests; thus a hypothesis is tested if and only if all of its implying
hypotheses are rejected. In an SD procedure, the hypotheses are tested
beginning with the most significant one, and testing continues until a
hypothesis is not rejected, at which point all the remaining hypotheses are
accepted by implication without actually testing them.
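
The difference between the two orders of testing can be sketched in Python as
follows. This is only an illustrative skeleton, assuming that the p values are
already sorted from smallest to largest and that a critical value is supplied
for each position; the function names and the three p values are hypothetical.

```python
def step_down(sorted_p, crit):
    """Start with the most significant p value; stop at the first non-rejection."""
    rejected = 0
    for p, c in zip(sorted_p, crit):
        if p >= c:            # not rejected: retain this and all remaining hypotheses
            break
        rejected += 1
    return rejected           # number of hypotheses rejected

def step_up(sorted_p, crit):
    """Start with the least significant p value; stop at the first rejection."""
    for i in range(len(sorted_p) - 1, -1, -1):
        if sorted_p[i] < crit[i]:
            return i + 1      # this and all more significant hypotheses are rejected
    return 0

alpha = 0.05
sorted_p = [0.020, 0.022, 0.030]              # hypothetical ordered p values
crit = [alpha / 3, alpha / 2, alpha]          # alpha / (c - i + 1) for i = 1, 2, 3
n_down = step_down(sorted_p, crit)
n_up = step_up(sorted_p, crit)
print(n_down, n_up)
```

With identical critical values the step up order can reject hypotheses that the
step down order never reaches, which is the source of the power differences
discussed later in the paper.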
2. Non parametric Tests:
These methods can be classified according to whether they are simultaneous or
sequential procedures.

Adjusted p-value methods:
Single Step: Bonferroni
Step Wise, Step Up: Hommel, Hochberg, Rom
Step Wise, Step Down: Holm, Holland

2.1. Bonferroni Test (1961):

The Bonferroni method applies to both continuous and discrete data. This
method is flexible because it controls the FWE for tests of joint hypotheses
about any subset of m separate hypotheses (including individual contrasts). The
procedure rejects a joint hypothesis H0 if any p-value for the individual
hypotheses included in H0 is less than α/c. The Bonferroni method, however,
yields conservative bounds on the Type I error and hence has low power. This
procedure controls the FWE at α without any further assumption on the
dependence structure of the p values. The method is designed for pairwise
comparisons as well as combinations of means, provided the number of
comparisons to be made is fixed in advance. It is recommended for
non-orthogonal contrasts because it splits the Type I error rate equally among
all comparisons.


The Bonferroni procedure is used for evaluating a small number of contrasts
that are selected prior to observing the data, while preserving a selected
family-wise Type I error rate (Ott & Longnecker, 2010). The researcher must
have sufficient theory about the phenomena of interest in order to know which
contrasts to specify. It is appropriate when the number of comparisons exceeds
the number of degrees of freedom between groups. This method controls the Type
I error but increases the Type II error. The purpose of the Bonferroni
procedure is to reduce the probability of identifying significant results that
do not exist, that is, to guard against making a Type I error in the testing
process (Mchugh, 2011). This potential for error increases with the number of
tests being performed in a given study and is due to the multiplication of
probabilities across the multiple tests (Mchugh, 2011).
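
The multiplication of probabilities mentioned above can be made concrete with a
small Python sketch. It rests on the simplifying assumption that the c tests
are independent, in which case the chance of at least one Type I error is
1 - (1 - α)^c; the function name is hypothetical.

```python
def familywise_error(alpha, c):
    """Probability of at least one Type I error in c independent tests."""
    return 1.0 - (1.0 - alpha) ** c

# Uncorrected tests at alpha = .05 versus Bonferroni-corrected tests at .05 / c.
for c in (1, 3, 6, 10):
    uncorrected = familywise_error(0.05, c)
    corrected = familywise_error(0.05 / c, c)
    print(c, round(uncorrected, 4), round(corrected, 4))
```

Under this assumption the uncorrected error rate grows quickly with c, whereas
the corrected rate stays at or below .05.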


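Test statistics:
$$t_{cal} = \frac{\bar{X}_i - \bar{X}_j}{\sqrt{MS_E\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}}$$ ...(1)

Where,
$MS_E$ = Error mean square from the ANOVA.
$\bar{X}_i$ and $\bar{X}_j$ are the two means being compared.
$n_i$ and $n_j$ are the respective sample sizes from populations i and j.

Critical Value:
$$t_{tab} = t_{\alpha/2c,\ df_E}$$ ...(2)

Where,
t is the value from the t distribution.
c = number of pairwise comparisons in the family.

For complex comparison:
$$t_{cal} = \frac{\sum_{i=1}^{k} C_i \bar{X}_i}{\sqrt{MS_E \sum_{i=1}^{k} \frac{C_i^2}{n_i}}}$$ ...(3)

Decision procedure:
Reject the null hypothesis if $|t_{cal}| > t_{tab}$; do not reject $H_0$ otherwise.

Confidence Interval:
$$(\bar{X}_i - \bar{X}_j) \pm t_{\alpha/2c,\ df_E} \sqrt{MS_E\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$ ...(4)
Here, the margin of error depends on the number of comparisons.
Confidence interval for the contrasts:
$$\sum_{i=1}^{k} C_i \bar{X}_i \pm t_{\alpha/2c,\ df_E} \sqrt{MS_E \sum_{i=1}^{k} \frac{C_i^2}{n_i}}$$ ...(5)
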

Advantages:
1. This method is highly flexible because it can be applied to test any subset
of hypotheses for continuous data, discrete data and even correlated tests.


2. This method can be used for any design.


3. When the number of comparisons is small (i.e. less than or equal to the
number of groups - 1), it gives a smaller confidence interval.


4. This method is useful in confirmatory research when a family of selected
pairwise comparisons is specified prior to data collection; it reduces the
problem of alpha inflation (Mchugh, 2011).


5. The Bonferroni method can also test complex comparisons (Mchugh, 2011).


6. This method has relatively good power for small sets of planned comparisons
(Toothaker, 1993).


Disadvantages:
1. It cannot be used for data snooping because the tests of interest must be
specified prior to data analysis.


2. This method has lower power to reject an individual hypothesis, and it lacks
power if several highly correlated tests are undertaken (Simes, 1986; Li,
2009; Hommel, 1988).


3. Its power declines quickly as the number of comparisons increases
(Olejnik et al., 1997; Toothaker, 1993).


4. It is not a tool for exploratory data analysis.


5. The test does not take into account whether the findings are consistent with
theory and past research. If consistent with previous findings and theory, an
individual result should be less likely to be a Type I error.


2.2. Holm Test (1979):

It is a modification of the Bonferroni procedure that yields a more powerful
test. The goal of the Holm method is to increase the power of the statistical
tests while keeping the FWE under control. It is a step down procedure. It is
also called a sequential rejection method because it examines each hypothesis
in an ordered sequence, and the decision to accept or reject the null
hypothesis depends on the results of the previous hypothesis tests (Tamhane et
al, 1998). Holm was the first to formally introduce a sequentially rejective
Bonferroni procedure. Like the Bonferroni method, it does not account for the
correlations between the test statistics, so the Holm procedure can itself be
improved.


The Holm method can be applied to almost any data because of its non-parametric
nature. This test can be applied in any pairwise comparison where the classical
Bonferroni test is usually applied. It is applicable to pairwise comparisons of
medians as well as to linear or non-linear combinations of medians. It is used
to perform a priori comparisons. For several a priori contrasts, not
necessarily pairwise, it controls the FWE while at the same time maximizing the
power (Howell, 2007).


Assumptions:

There are no restrictions on the type of test; the only requirement is that it
should be possible to calculate the obtained level for each separate test.
Further, there is no problem in including in the analysis only the hypotheses
that are of a priori interest, whereas more specialized multiple tests usually
include all hypotheses of a certain kind.


Holm's procedure may be used either as a protected test or as an unprotected
test, but the protected version is preferred due to the additional power gains.
When there exist logical implications among the hypotheses, however, problems
arise which have to be taken into consideration (Holm, 1979). In summary,
Holm's procedure makes no distributional assumptions and no logical assumptions
about the hierarchy of the hypotheses to be tested, and it does not assume
independence of comparisons.
Procedure:
Order the p values, $p_{(1)} \le \dots \le p_{(c)}$, and denote the
corresponding hypotheses $H_{(1)}, \dots, H_{(c)}$. Start with the smallest p
value, $p_{(1)}$. If $p_{(1)} > \alpha/c$, then stop testing and accept all the
hypotheses; otherwise reject $H_{(1)}$ and go to the next step. In general, if
testing has continued to the i-th step ($1 \le i \le c$) and if
$p_{(i)} > \alpha/(c-i+1)$, then stop testing and accept all the remaining
hypotheses, $H_{(i)}, \dots, H_{(c)}$; otherwise reject $H_{(i)}$ and go to the
next step.


In short, this procedure rejects the specific hypothesis $H_{(i)}$, for
i = 1, 2, ..., c, provided both $p_{(i)} < \alpha/(c-i+1)$ and
$H_{(1)}, \dots, H_{(i-1)}$ have all been rejected.


Like the Bonferroni procedure, Holm's procedure can also modify the p-values
directly, by multiplying each p-value by the adjustment factor c-i+1, where i
is the index of the step associated with the p value.
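
The direct modification of the p values can be sketched in Python as below.
This is a minimal illustration, not a reference implementation: the function
name is hypothetical, and the sketch additionally assumes the usual convention
that each adjusted value is capped at 1 and is not allowed to fall below the
adjusted value of an earlier step, so that the step down stopping rule is
respected. The ten p values are those of the worked example in the Simulation
Study section.

```python
def holm_adjust(p_values):
    """Holm adjusted p values: (c - i + 1) * p_(i), made non-decreasing."""
    c = len(p_values)
    order = sorted(range(c), key=lambda k: p_values[k])   # indices, smallest p first
    adjusted = [0.0] * c
    running_max = 0.0
    for step, k in enumerate(order, start=1):
        candidate = min(1.0, (c - step + 1) * p_values[k])
        running_max = max(running_max, candidate)          # enforce the step down order
        adjusted[k] = running_max
    return adjusted

p = [0.002, 0.0054, 0.007, 0.008, 0.009, 0.0094, 0.012, 0.015, 0.028, 0.067]
adj = holm_adjust(p)
rejected = [a < 0.05 for a in adj]
print(sum(rejected))
```

A hypothesis is rejected when its adjusted p value is below α, which gives the
same decisions as comparing $p_{(i)}$ with α/(c-i+1) step by step.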


For unequal sample sizes, the test statistic is the same as Bonferroni's,
given by ...(1)


For equal sample sizes, the test statistic is given as

$$t' = \frac{\bar{X}_i - \bar{X}_j}{\sqrt{\frac{2\,MS_E}{n}}}$$ ...(6)


Calculate t' for all contrasts of interest and then arrange the t' values in
increasing order without regard to sign. This ordering can be represented as
$|t'_1| \le |t'_2| \le |t'_3| \le \dots \le |t'_c|$, where c is the total
number of contrasts to be tested.


The first significance test is carried out by evaluating $t'_c$ against the
critical value in Dunn's table corresponding to c contrasts. In other words,
$t'_c$ is evaluated at α' = α/c. If this largest t' is significant, then we
test the next largest t' (i.e. $t'_{c-1}$) against the critical value in Dunn's
table corresponding to c-1 contrasts. Thus, $t'_{c-1}$ is evaluated at
α' = α/(c-1). The same procedure continues for $t'_{c-2}$, $t'_{c-3}$,
$t'_{c-4}$, ... until the test returns a non-significant result. At that point
we stop testing. Holm has shown that such a procedure continues to keep
FWE ≤ α, while offering a more powerful test.


The logic behind the test is that when we reject the null for $t'_c$, we
declare that null hypothesis to be false. If it is false, that only leaves c-1
possibly true null hypotheses, and so we only need to protect against c-1
contrasts. A similar logic applies as we carry out additional tests. This logic
makes particular sense when, even before the experiment is conducted, we know
that some of the null hypotheses are almost certain to be false. If they are
false, there is no point in protecting against erroneously rejecting them.


Critical Value:
$$\alpha' = \frac{\alpha}{c-i+1}$$ ...(7)
Decision procedure:
Reject $H_{(1)}$ to $H_{(i)}$ if
$$p_{(j)} < \frac{\alpha}{c-j+1} \quad \text{for all } j = 1, \dots, i$$ ...(8)


α' changes at every stage because of the step down nature of the method.
The critical value of this method is based on the Bonferroni inequality.


Advantages:
1. This method is flexible and simple to implement.


2. It controls the FWE in the strong sense, i.e. it guarantees control of the
generalized Type I error probability to be at most α (Hochberg, 1988;
Schochet, 2008; Ekenstierna, 2004; Hochberg & Benjamini, 1990; De
Muth, 2006).


3. It achieves a lower Type II error while keeping the Type I error rate at a
level less than α (Hochberg & Benjamini, 1990).

4. It can be used for equal as well as for unequal sample sizes.

Disadvantages:


1. The power of this method is small if all the hypotheses are almost true, but
it may be considerable if a number of hypotheses are completely wrong (Holm,
1979).


2. It does not compute confidence intervals.


3. It does not consider the logical interrelationships among the c hypotheses.


4. It becomes very conservative when the number of comparisons is large and
when the tests are not independent (De Muth, 2006).


2.3. Holland & Copenhaver Test (1987):

It uses the Šidák (1967) inequality to set the criterion for each hypothesis
test. It is a step down procedure. This method is useful in situations that
call for further research and in which there is no logical interrelationship
among the hypotheses.


Assumptions: 
Positive orthant dependence of the test statistics is satisfied. 


Procedure:

Let $p_{(1)}, \dots, p_{(c)}$ be the ordered p values (smallest to largest) and
$H_{(1)}, \dots, H_{(c)}$ be the corresponding hypotheses. Suppose i is the
smallest integer from 1 to c such that $p_{(i)} > 1 - (1-\alpha)^{1/(c-i+1)}$;
the Holland-Copenhaver procedure then rejects $H_{(1)}$ to $H_{(i-1)}$ and
retains $H_{(i)}$ to $H_{(c)}$ (Olejnik et al, 1997).
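
The procedure above can be sketched in Python as follows. The function name is
hypothetical and the sketch is illustrative only; it uses the ten p values of
the worked example in the Simulation Study section.

```python
def holland_copenhaver(p_values, alpha=0.05):
    """Step down test with Sidak-based critical values 1 - (1 - alpha)^(1/(c-i+1))."""
    c = len(p_values)
    order = sorted(range(c), key=lambda k: p_values[k])    # indices, smallest p first
    crit = [1.0 - (1.0 - alpha) ** (1.0 / (c - i + 1)) for i in range(1, c + 1)]
    reject = [False] * c
    for i, k in enumerate(order):
        if p_values[k] > crit[i]:      # first p value above its criterion: stop
            break
        reject[k] = True
    return crit, reject

p = [0.002, 0.0054, 0.007, 0.008, 0.009, 0.0094, 0.012, 0.015, 0.028, 0.067]
crit, reject = holland_copenhaver(p)
print([round(v, 6) for v in crit], sum(reject))
```

Each criterion is slightly larger than the corresponding Holm criterion
α/(c-i+1), which is where the small gain in power comes from.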


Test Statistics:
For unequal sample sizes, the test statistic is the same as Bonferroni's,
given by ...(1)


For equal sample sizes, the test statistic is the same as Holm's, given by
...(6)


Critical Value:
$$1 - (1-\alpha)^{1/(c-i+1)}$$ ...(9)


Decision procedure:
Reject $H_{(1)}$ to $H_{(i-1)}$ if
$$p_{(j)} < 1 - (1-\alpha)^{1/(c-j+1)} \quad \text{for all } j = 1, \dots, i-1$$ ...(10)

Advantages:

This method is conservative under the condition that the test statistics are
positive orthant dependent.


Disadvantages:
The applicability of this method is slightly narrower than that of the Holm
procedure because it requires the test statistics to be positive orthant
dependent.


2.4. Hommel Test (1988):

Hommel (1988) employed the closure principle to extend the Simes test and
developed a stepwise multiple testing procedure controlling the FWE. It is
based on the Simes (1986) equality. This is a step up method and it is a
protected test. This procedure is conservative only when the test statistics
are independent, because it is based on the Simes equality for independent p
values. It is not always necessary to test every possible combination of
hypotheses, i.e. it can also be used for a few comparisons.


Hommel generalized the Simes procedure so that it gives strong control of the
FWE whenever Simes' original procedure achieves weak control (e.g. with
independent tests).


Assumptions: 
Test statistics are independent. 


Procedure:
Reject all hypotheses that have a p value < α/j', where j' is defined through
the set

$$J = \left\{ i \in \{1, \dots, c\} : p_{(c-i+k)} > \frac{k\alpha}{i} \ \text{for } k = 1, \dots, i \right\}$$ ...(11)


If J is non-empty, reject $H_i$ whenever $p_i < \alpha/j'$ with j' = max J. If
J is empty, reject all $H_i$ (i = 1, 2, ..., c).


This procedure includes two stages. The first stage uses the obtained p-values
to compute the members of J. The second stage obtains the significance level of
rejection using α' = α/j', where j' is the largest number in J.
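
The two stages can be sketched in Python as below. The function name is
hypothetical and the code is only an illustration of the rule as stated here;
it uses the ten p values of the worked example in the Simulation Study section.

```python
def hommel(p_values, alpha=0.05):
    """Two-stage Hommel rule: build the set J, then reject p < alpha / max(J)."""
    c = len(p_values)
    p_sorted = sorted(p_values)                  # p_sorted[0] is p_(1)
    # Stage 1: i belongs to J if p_(c-i+k) > k * alpha / i for every k = 1, ..., i.
    J = [i for i in range(1, c + 1)
         if all(p_sorted[c - i + k - 1] > k * alpha / i for k in range(1, i + 1))]
    if not J:                                    # J empty: reject every hypothesis
        return None, [True] * c
    # Stage 2: single rejection level alpha / j' with j' the largest member of J.
    j_prime = max(J)
    return j_prime, [p < alpha / j_prime for p in p_values]

p = [0.002, 0.0054, 0.007, 0.008, 0.009, 0.0094, 0.012, 0.015, 0.028, 0.067]
j_prime, reject = hommel(p)
print(j_prime, sum(reject))
```

For these p values the sketch gives j' = 2, so every hypothesis with p < 0.025
is rejected.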

The test statistic is the same as Holm's, given in ...(6)


Critical Value:
$$\alpha' = \frac{\alpha}{j'}$$ ...(12)


Advantages:
1. It protects the FWE only when the test statistics are independent
(Dmitrienko et al, 2009; Olejnik et al, 1997).


2. The uniqueness of the Hommel procedure is that it not only considers the
order of the tests but also takes the obtained p values into the calculation
while computing α'.


Disadvantages: 
1. This method is relatively complicated. 


2. When correlations between variables are negative, the test can sometimes
allow slightly more Type I errors than the stated maximum family wise error.

2.5. Hochberg Test (1988):

It is a modification of the Dunn procedure. This procedure uses critical values
identical to those used in the Holm procedure but provides a potential for
increased power by conducting the tests in a step up rather than a step down
sequence. It is a step up method and is based on the Simes (1986) equality.


Assumptions: 
Tests are independent of one another. 


Procedure:

Hochberg derived an even sharper procedure which uses the ordered $p_{(i)}$ but
in a different way from Holm's procedure. This procedure starts by examining
the largest p-value $p_{(c)}$. If $p_{(c)} < \alpha$, then $H_{(c)}$ and all
other hypotheses are rejected. If not, $H_{(c)}$ is not rejected and one
proceeds to compare $p_{(c-1)}$ with α/2. If the former is smaller, then
$H_{(c-1)}$ and all hypotheses with smaller p-values are rejected. Generally,
one proceeds from the highest to the lowest p-values, retaining $H_{(i)}$ if
its p-value satisfies $p_{(i)} > \alpha/(c-i+1)$. One stops the procedure at
the first ordered hypothesis for which that inequality is reversed. This
hypothesis is rejected, together with all hypotheses with lower or equal
p-values. This is always a sharper procedure than Holm's.
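
A minimal Python sketch of this step up rule is given below. The function name
is hypothetical, and the ten p values are those of the worked example in the
Simulation Study section.

```python
def hochberg(p_values, alpha=0.05):
    """Step up test: scan from the largest p value down to the smallest."""
    c = len(p_values)
    order = sorted(range(c), key=lambda k: p_values[k])    # indices, smallest p first
    reject = [False] * c
    for i in range(c, 0, -1):                              # i = c, c-1, ..., 1
        if p_values[order[i - 1]] < alpha / (c - i + 1):
            for k in order[:i]:                            # reject H_(1), ..., H_(i)
                reject[k] = True
            break
    return reject

p = [0.002, 0.0054, 0.007, 0.008, 0.009, 0.0094, 0.012, 0.015, 0.028, 0.067]
reject = hochberg(p)
print(sum(reject))
```

The critical values are the same as Holm's; only the direction of the scan
differs, which is why hypotheses that Holm's procedure never reaches can still
be rejected.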


Critical Value:

$$\alpha' = \frac{\alpha}{c-i+1}$$ ...(13)


Decision procedure:
Reject $H_{(1)}$ to $H_{(i)}$ for any i = c, c-1, ..., 1 if

$$p_{(i)} < \frac{\alpha}{c-i+1}$$ ...(14)






Advantages:

1. This procedure has strong control over the FWE at α even if the free
combination condition is not satisfied (Holm, 1979; Holland & Copenhaver,
1987; Olejnik et al, 1997).


2. It controls the FWE under the same conditions for which the Simes global
test controls the Type I error rate.

3. This method achieves the same Type I FWE control with lower Type II error
rates (Hochberg & Benjamini, 1990).


4. It has the nice characteristic that no adjusted p value can be larger than
the largest of the unadjusted p values (Wright, 1992).


5. This method is able to reject at least one individual hypothesis when the
global null hypothesis is rejected. This property of consonance makes the
Hochberg procedure easy to interpret (Rom, 1990).

Disadvantages:


1. It lacks stability under certain conditions, for example when the test
statistics are dependent or correlated (Schochet, 2008).


2. It can only be applied to independent hypothesis tests (Olejnik et al.,
1997; Schochet, 2008).


2.6. Rom Test (1990):

It is a modification of the Hochberg procedure intended to increase the
statistical power. It is a step up procedure. Increased power is achieved by
identifying the adjusted significance levels that control the Type I error rate
at exactly the nominal level when the test statistics are independent (Olejnik
et al, 1997).


Assumptions: 
Test statistics are independent. 


Procedure:

The Rom procedure differs from the Hochberg procedure in how the adjusted
significance levels are obtained. Both procedures set $\alpha'_{m}$ equal to α
and $\alpha'_{m-1}$ equal to α/2, but the remaining m - 2 adjusted significance
levels differ. The adjusted significance levels are determined recursively as

$$\alpha'_{m-i+1} = \frac{1}{i}\left[\sum_{j=1}^{i-1} \alpha^{j} - \sum_{j=1}^{i-2} \binom{i}{j} \left(\alpha'_{m-j}\right)^{i-j}\right], \quad i = 3, \dots, m$$ ...(15)

where $\alpha'_{m} = \alpha$ and $\alpha'_{m-1} = \alpha/2$.


It is a step up procedure with critical values $c_1 = \alpha$,
$c_2 = \alpha/2$, $c_3 = \alpha/3 + \alpha^2/12$, etc.
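
The recursion can be sketched in Python as follows. The function name and the
indexing are assumptions made for the illustration: a[i] denotes the critical
value applied to the i-th largest p value, so that a[1] = α, a[2] = α/2 and
a[3] = α/3 + α²/12.

```python
from math import comb

def rom_critical_values(m, alpha=0.05):
    """Rom critical values; element i-1 applies to the i-th largest p value."""
    a = {1: alpha}
    if m >= 2:
        a[2] = alpha / 2
    for i in range(3, m + 1):
        first = sum(alpha ** j for j in range(1, i))              # alpha + ... + alpha^(i-1)
        second = sum(comb(i, j) * a[j + 1] ** (i - j)             # earlier critical values
                     for j in range(1, i - 1))
        a[i] = (first - second) / i                               # divide the difference by i
    return [a[i] for i in range(1, m + 1)]

values = rom_critical_values(10)
print([round(v, 6) for v in values])
```

Each value is computed from all of the previously computed ones, which is why
the calculation is iterative and becomes laborious by hand as m grows.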
allow slightly more Type I errors than the stated maximum family wise error.


First, we denote $H_{(m)}$ as the hypothesis with the largest p-value and
$H_{(1)}$ as the hypothesis with the smallest p-value.


The testing starts by comparing $p_{(m)}$ with $\alpha'_{m}$ and stops when
$p_{(i)} < \alpha'_{i}$. Then $H_{(i+1)}$ to $H_{(m)}$ are retained and
$H_{(1)}$ to $H_{(i)}$ are rejected. The computing equation for solving the
α's can be divided into three parts.


The first part is $\alpha^{1} + \alpha^{2} + \dots + \alpha^{i-1}$ and the
second part is $\sum_{j=1}^{i-2} \binom{i}{j} (\alpha'_{m-j})^{i-j}$.


The third part is to solve for $\alpha'_{m-i+1}$, which subtracts the second
part from the first part and divides the difference by i.


Advantages:
1. It exactly controls the FWE at α for independent test statistics (Schochet,
2008).


2. It is motivated by lowering the Type II error.


3. The Rom procedure has the desired FWE only for independent tests, for
complex comparisons.


Disadvantages:
1. The calculation of this method is complicated and iterative.


2. Adjusted critical values are provided for up to 10 tests when the overall
alpha equals 0.05 or 0.01. As the number of hypothesis tests increases, the
calculations become impractical even when a computer is used.


3. Comparison:
The methods discussed above are compared with respect to conservatism, power,
confidence interval estimation and a simulation study.


Conservatism:

The Bonferroni method has the largest adjusted p values and is thus the most
conservative method, followed by the Holm (1979), Hochberg (1988), and Hommel
(1988) methods. The Bonferroni and Holm (1979) methods show the lowest Type I
error, whereas the Hochberg (1988) and Hommel (1988) methods allow more error
but are still conservative when ρ (the correlation) exceeds 0.5.


The Holm procedure is a closed testing procedure in which each intersection
hypothesis is tested using a global test based on the Bonferroni procedure.
The Holm procedure rejects the global hypothesis if and only if the Bonferroni
procedure does, and therefore the conclusions regarding the conservative
nature of the Bonferroni procedure also apply to the Holm procedure
(Dmitrienko et al, 2009).


The Hochberg procedure uses the same criterion for each hypothesis as the Holm
procedure but tests hypotheses with larger p values first. Consequently this
procedure will test and possibly reject hypotheses not examined by the Holm
procedure, while rejecting the same hypotheses that are rejected by the Holm
procedure (Dunnett & Tamhane, 1992; Hochberg, 1988; Olejnik et al, 1997). In
most real-life cases, the conclusions from the two methods, i.e. Holm and
Hochberg, will rarely differ.


Power:

The Holm procedure is more powerful than the Bonferroni method because its
bound increases sequentially whereas the Bonferroni bound remains fixed.
Statistical power is gained by sequentially increasing the criterion for
statistical significance. Because any hypothesis rejected by the original
Bonferroni procedure will also be rejected by the Holm procedure, the latter
cannot have lower power for an individual hypothesis test, i.e. it is at least
as powerful as Bonferroni. Moreover, Holm claims that in actual practice the
gain in power with his procedure as compared to Bonferroni is non-negligible,
because α/(c-i+1) is much larger than α/c for many values of i (Olejnik et al,
1997).


Any hypothesis rejected by Holm's procedure will always be rejected by
Hochberg's procedure (Dunnett & Tamhane, 1992; Hochberg, 1988). However, the
power differences tend to be negligible (Olejnik et al., 1997). The Hochberg
procedure is uniformly more powerful than the Holm procedure (Hochberg, 1988)
but, on the other hand, it is uniformly less powerful than the Hommel
procedure (Hommel, 1989). Owing to the independence assumption required by
Hochberg, the Holm procedure may be the best choice if independence of the
tests is not certain. The criterion used by the Holland and Copenhaver
procedure is slightly larger than that of the Holm procedure, thus leading to
slightly greater power for an individual hypothesis test (Olejnik et al, 1997).


The Hommel method is uniformly more powerful than the Holm procedure because
the Simes test is uniformly more powerful than the global test based on the
Bonferroni procedure (Dmitrienko et al, 2009). For n>2, there are situations
where Hommel rejects and Hochberg does not (Hommel, 1989). The Hommel
procedure rejects more hypotheses than either the Rom or the
Holland-Copenhaver procedure; however, the difference in the number of tests
rejected is very small. The Hochberg and Hommel procedures are more powerful,
but they are known to have the desired FWE only for independent tests (Hommel,
1989). Rom gives slightly higher critical p-values that can be used with
Hochberg's procedure, making it somewhat more powerful.


Among these sequential modifications, Holm's procedure is the least powerful
method, because it is based on the Bonferroni inequality. The Rom and Hommel
procedures are more powerful than Hochberg's procedure because sharp
inequalities (or equalities) are used in both; however, the power improvement
is negligible compared to their complexity.


The increase in power for individual hypothesis tests provided by the Hommel
and Rom procedures over the Hochberg approach is at best marginal, with the
Rom procedure having only a slight advantage over the Hommel procedure
(Dunnett & Tamhane, 1992; Olejnik et al., 1997).


The Holland-Copenhaver and Hochberg procedures provide power very close to
that obtained by the Hommel and Rom procedures, particularly when the total
number of hypotheses tested is not too large. If the number of false null
hypotheses is large, the Hochberg procedure might provide a better chance of
detecting all of them than the Holland-Copenhaver procedure.


As the sample size increases, statistical power increases, but when the number
of variables in a matrix increases, the probability of rejecting all of the
non-null hypotheses decreases. All five of the enhancements are more sensitive
than the original Bonferroni procedure in detecting all true nonzero
relationships. The difference between the original Bonferroni procedure and
the enhancements increases as the number of true nonzero relationships
increases. Very small differences in statistical power are found among the
five enhancements to the original Bonferroni procedure. The Holm procedure has
the lowest sensitivity in detecting all true nonzero relationships, whereas
the Rom procedure has the greatest power. When all the correlations are
nonzero, the Hochberg, Hommel, and Rom procedures have the same estimated
power.


Because step up sequential multiple comparisons are based on the Simes
equality, which assumes independence of comparisons, it is reasonable to
suggest that dependence or correlation between the means of groups should
affect the Type I error control and power (Zweifel, 2014).


In summary, among the procedures compared (Bonferroni, Holm, Holland,
Hochberg, Hommel, Rom), the Bonferroni procedure has the lowest percentage of
rejections and the Hommel procedure has the highest percentage of rejections
whenever differences exist among the procedures. Overall, the SU procedures
are a little more powerful than the SD procedures. Within the SU procedures,
whenever differences occur, the Hommel procedure has a slightly higher
percentage of rejections than the Hochberg procedure. Within the SD
procedures, whenever differences occur, the Holland procedure has a slightly
higher percentage of rejections than the Holm procedure (Olejnik et al, 1997).


Confidence Interval:

All the methods except Bonferroni are stepwise methods, for which confidence
intervals cannot be obtained, so a comparison with respect to confidence
intervals is not possible.


Simulation Study:

This section discusses results for tests reported as adjusted p values such
that, if the adjusted p value for an individual hypothesis is less than the
chosen significance level α, then the hypothesis is rejected with FWE not more
than α. It covers the Bonferroni procedure and the modifications of that
procedure by Holm, Holland & Copenhaver, Hommel, Hochberg and Rom.


As a concrete example, imagine that we have ten p values, and they are (in
order from smallest to largest) as follows: 0.002, 0.0054, 0.007, 0.008, 0.009,
0.0094, 0.012, 0.015, 0.028, and 0.067.


We compare each probability (p value) with the critical value given by the
Bonferroni method and by the modifications of that procedure due to Holm,
Holland & Copenhaver, Hommel, Hochberg and Rom.

Table 1: Rejection criteria according to different available Tests
No.  Prob.   Bonferroni  Holm      Holland & Copenhaver  Hommel  Hochberg  Rom
1    0.002   0.005       0.005     0.005116197           0.025   0.005     0.005115
2    0.0054  0.005       0.005556  0.005683045           0.025   0.005556  0.005681
3    0.007   0.005       0.00625   0.006391151           0.025   0.00625   0.006388
4    0.008   0.005       0.007143  0.007300832           0.025   0.007143  0.0073
5    0.009   0.005       0.008333  0.008512445           0.025   0.008333  0.008505
6    0.0094  0.005       0.01      0.010206218           0.025   0.01      0.0102
7    0.012   0.005       0.0125    0.012741455           0.025   0.0125    0.0127
8    0.015   0.005       0.016667  0.016952428           0.025   0.016667  0.016875
9    0.028   0.005       0.025     0.025320566           0.025   0.025     0.025
10   0.067   0.005       0.05      0.05                  0.025   0.05      0.05























Table 2: Hypotheses Rejection by all these multiple comparison procedures

No.  Bonferroni  Holm    Holland & Copenhaver  Hommel  Hochberg  Rom
1    Reject      Reject  Reject                Reject  Reject    Reject
2    Accept      Reject  Reject                Reject  Reject    Reject
3    Accept      Accept  Accept                Reject  Reject    Reject
4    Accept      Accept  Accept                Reject  Reject    Reject
5    Accept      Accept  Accept                Reject  Reject    Reject
6    Accept      Accept  Accept                Reject  Reject    Reject
7    Accept      Accept  Accept                Reject  Reject    Reject
8    Accept      Accept  Accept                Reject  Reject    Reject
9    Accept      Accept  Accept                Accept  Accept    Accept
10   Accept      Accept  Accept                Accept  Accept    Accept





The simulation study also shows that the Holm procedure is more powerful than
the Bonferroni method, because the bound for this method increases
sequentially whereas the Bonferroni bound remains fixed. Any hypothesis
rejected by the original Bonferroni procedure will also be rejected by the
Holm procedure; the latter procedure cannot have lower power for an individual
hypothesis test. Any hypothesis rejected by Holm's procedure will always be
rejected by Hochberg's procedure. The Hochberg procedure is uniformly more
powerful than the Holm procedure but, on the other hand, it is uniformly less
powerful than the Hommel procedure. The criterion used by the Holland and
Copenhaver procedure is slightly larger than that of the Holm procedure, thus
leading to slightly greater power for an individual hypothesis test. The
Hommel procedure rejects more hypotheses than either the Rom or the
Holland-Copenhaver procedure; however, the difference in the number of tests
rejected is very small. Rom gives slightly higher critical p-values that can
be used with Hochberg's procedure, making it somewhat more powerful.


Table 3: Comparison of Multiple Comparison procedures

Test             SS/SW  Based on               Modification of  Remarks
Bonferroni       SS     Bonferroni inequality  -                Planned contrasts, both simple and complex
Holm (1979)      SD     Bonferroni inequality  Bonferroni       Comparisons are not independent
Holland (1987)   SD     Šidák inequality       Bonferroni       Positive orthant dependence
Hommel (1988)    SU     Simes inequality       Holm             When comparisons are independent
Hochberg (1988)  SU     Simes inequality       Holm             When comparisons are independent
Rom (1990)       SU     Simes inequality       Hochberg         When comparisons are independent